Discussions of the foundations of statistical mechanics, how they lead to thermodynamics, and the 
appropriate definition of entropy have occasioned many disagreements. I believe that some or all of 
these disagreements arise from differing but unstated assumptions, which can make opposing 
opinions difficult to reconcile. To make these assumptions explicit, I discuss the principles that have 
guided my own thinking about the foundations of statistical mechanics, the microscopic origins of 
thermodynamics, and the definition of entropy. The purpose of this paper will be fulfilled if it paves 
the way to a final consensus, whether or not that consensus agrees with my point of view. © 2011 
American Association of Physics Teachers. 

I. INTRODUCTION 

“Nobody really knows what entropy really is.” 

— John von Neumann 1 

Since I began speaking and publishing on the relation be- 
tween statistical mechanics and thermodynamics in general 
and the meaning of entropy in particular, 2-7 I’ve encountered 
a diversity of opinion among experts that is remarkable for a 
field that is well over a century old. Most colleagues with 
whom I have discussed the matter have indicated that they 
believe their views are essentially the same as those of the 
majority of physicists. However, when we discuss details, 
opinions turn out to be quite diverse and, at times, 
contentious. 4,8 

The following is a partial list of opinions I have encoun- 
tered in the literature and in discussions with other scientists: 

• The theory of probability has nothing to do with statistical 
mechanics. 

• The theory of probability is the basis of statistical mechan- 
ics. 

• The entropy of an ideal classical gas of distinguishable 
particles is not extensive. 

• The entropy of an ideal classical gas of distinguishable 
particles is extensive. 

• The properties of macroscopic classical systems with dis- 
tinguishable and indistinguishable particles are different. 

• The properties of macroscopic classical systems with dis- 
tinguishable and indistinguishable particles are the same. 

• The entropy of a classical ideal gas of distinguishable par- 
ticles is not additive. 

• The entropy of a classical ideal gas of distinguishable par- 
ticles is additive. 

• Boltzmann defined the entropy of a classical system by the 
logarithm of a volume in phase space. 

• Boltzmann did not define the entropy by the logarithm of a 
volume in phase space. 

• The symbol W in the equation S=k log W, which is in- 
scribed on Boltzmann’s tombstone, refers to a volume in 
phase space. 

• The symbol W in the equation S=k log W, which is in- 
scribed on Boltzmann’s tombstone, refers to the German 
word “Wahrscheinlichkeit” (probability). 

• The entropy should be defined in terms of the properties of 
an isolated system. 

• The entropy should be defined in terms of the properties of 
a composite system. 

• Thermodynamics is only valid in the “thermodynamic 
limit,” that is, in the limit of infinite system size. 

• Thermodynamics is valid for finite systems. 

• Extensivity is essential to thermodynamics. 

• Extensivity is not essential to thermodynamics. 

This remarkable diversity of opinion has an interesting 
consequence. When people discuss the foundations of statis- 
tical mechanics, the justification of thermodynamics, or the 
meaning of entropy, they tend to assume that the basic prin- 
ciples they hold are shared by others. These principles often 
go unspoken, because they are regarded as obvious. It has 
occurred to me that it might be good to restart the discussion 
of these issues by stating basic assumptions clearly and ex- 
plicitly, no matter how obvious they might seem. This paper 
is a start in that direction. 

There are two possible reactions to the principles I put 
forward. A reader might agree with them. In that case, we 
would have a firm basis on which to proceed. Or, a reader 
might take issue with one or more. In that case, we would 
know where the conflict lies, which would give us a good 
chance of resolving points of disagreement. In either case, 
we should be able to make progress toward arriving at a 
consensus, which is the goal of this paper. 

Because my topic is limited to macroscopic measurements 
of macroscopic systems, I will discuss what I understand 
those terms to mean in Sec. II. In this paper I will put for- 
ward 12 principles based on the concept of macroscopic 
measurements that have led me to advocate the use of Bolt- 
zmann’s 1877 definition of the entropy 11 over other defini- 
tions that are often found in textbooks. 


II. MACROSCOPIC SYSTEMS 

In this paper I am concerned with the question of how to 
describe the observed behavior of macroscopic systems. The 
concept of macroscopic frames all of my arguments, so it is 
important to make clear at the outset how I define it. A mac- 
roscopic system contains a large number of particles, and a 
macroscopic measurement is limited in its resolution. These 
two features are closely related, in that what can be regarded 
as a large number depends on the resolution of the macro- 
scopic measurements. 

The reason for specifying a large number of particles is 
that the quantities of interest in thermodynamics are collective 
variables, such as the energy or the number of particles 
in a system. The relative statistical fluctuations of such quan- 
tities are generally inversely proportional to the square root 
of the number of particles. If the statistical fluctuations are 
much smaller than the resolution of the macroscopic mea- 
surements, they can be ignored; the average values obtained 
from statistical mechanics then give a description of the ex- 
pected results of the experiment. 
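
The inverse-square-root scaling can be made concrete with a short numerical sketch. It assumes, purely for illustration, that each of N independent particles is found in a given subvolume with probability p, so that the particle count is binomially distributed. The function names and the factor of 100 used to express "much smaller" are hypothetical choices, not taken from the text.

```python
import math


def relative_fluctuation(n_total: float, p: float = 0.5) -> float:
    """Relative standard deviation of the particle count in a subvolume.

    Assumes binomial statistics: mean = n_total * p and
    variance = n_total * p * (1 - p).
    """
    mean = n_total * p
    std = math.sqrt(n_total * p * (1.0 - p))
    return std / mean


def is_macroscopic(n_total: float, resolution: float, p: float = 0.5) -> bool:
    """True if the fluctuations lie far below the measurement resolution.

    "Far below" is taken here to mean a factor of 100, an arbitrary choice.
    """
    return relative_fluctuation(n_total, p) < 0.01 * resolution


# The relative fluctuation falls off as 1/sqrt(N).
for n in (1e2, 1e8, 1e20):
    print(f"N = {n:.0e}: relative fluctuation = {relative_fluctuation(n):.1e}")
```

Under these assumptions, whether a given number of particles counts as "large" depends on the resolution passed to `is_macroscopic`, which mirrors the connection between particle number and measurement resolution described above.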

In the late 19th century, when Boltzmann and Gibbs did 
their seminal work, the existence of atoms had not been 
proven. The idea of experimentally observing atomic behav- 
ior was scarcely considered. Now, it is possible to obtain 
images of microscopic structure with atomic resolution. Nev- 
ertheless, I am restricting my attention in this paper to mac- 
roscopic measurements that cannot discern microscopic be- 
havior in order to discuss the emergence of a thermodynamic 
description from statistical mechanics. 

In the remainder of the paper I will give the rationale for 
each of the principles I have followed, and I will show how 
they lead to the adoption of Boltzmann’s 1877 definition of 
the entropy in terms of the logarithm of the probability of 
macroscopic states for a composite system. 

III. PROBABILITY OF MACROSCOPIC STATES 

Principle 1: Probability theory is necessary for a theoret- 
ical description of macroscopic behavior. 

The first — and most fundamental — principle is that the ba- 
sis for obtaining a description of a macroscopic system from 
microscopic laws of motion is given by probability theory. In 
any experiment (real or gedanken), the system is in some 
specific microscopic state (quantum or classical) at any given 
instant. That microscopic state is a property of the system, 
independent of measurement. 

The most immediate consequence of the limited resolution 
of macroscopic measurements is that it severely restricts our 
knowledge of the microscopic state of a system. We cannot 
determine the microscopic state experimentally — we can 
only eliminate microscopic states that are not consistent with 
our macroscopic observations. 

The limitations on our knowledge bring us to the distinc- 
tion between reality and our knowledge of reality. The reality 
is the microscopic state of the system at any given time. Our 
knowledge of reality consists of the information we obtain 
from macroscopic measurements and the conclusions we are 
able to draw from that information. We can only construct a 
representation or description of the behavior of the system; 
we cannot know the microscopic state of a system from mac- 
roscopic measurements. 

In quantum systems our knowledge is even more limited. 
For example, except for eigenstates, which have probability 
zero, the energy is not even determined uniquely by the mi- 
croscopic state, so it cannot be a property of the system 
independent of measurement. 

The most useful method I know for describing limited 
knowledge is Bayesian probability theory, 9 which led me to 
the first principle. 

After deciding to use probability theory, there remains the 
choice of which probability distribution to use. The most 
reasonable choice would seem to be the simplest that is con- 
sistent with what we know from macroscopic observations. 
Therefore, I take the probability distribution (a Bayesian prior 
or model probability) to be uniform in phase space for isolated 
classical systems (subject to constraints on the total 
energy and the restriction of the particles to certain volumes), 
and correspondingly uniform over microscopic states of 
quantum systems. The logical consequences of such prob- 
ability distributions are known to lead to predictions that 
agree with experiment, which is comforting. 

Principle 2: Probability theory is sufficient for a theoreti- 
cal description of macroscopic states. 

In one sense, the introduction of probability distributions 
very nearly completes the theory of many-body systems. 
Little else is essential. The concepts of entropy, free energy, 
etc. are extremely convenient, but they are not absolutely 
necessary. We could calculate anything and everything about 
the behavior of macroscopic systems without ever mention- 
ing them. 

This principle is very important because it implies that 
however we define concepts like entropy and free energy in 
statistical mechanics, the consequences of the definitions 
must be consistent with the predictions of probability theory 
if they are to have the properties required by thermodynam- 
ics. 

IV. COMPOSITE SYSTEMS 

Principle 3: Statistical mechanics and thermodynamics 
must predict the properties of composite systems. 

An essential part of statistical mechanics and thermody- 
namics is the analysis and prediction of the behavior of com- 
posite systems. A simple isolated system in equilibrium does 
not do anything macroscopically measurable. You can’t even 
make an experimental determination of its temperature with- 
out putting a thermometer in contact with it, and then you 
have a composite system. 

A simple container full of gas must also be regarded as a 
composite system if we want to investigate questions such as 
whether the density of the gas is uniform. Without concep- 
tually dividing the system into smaller subsystems, we can- 
not discuss density variations. 

An important feature of a composite system is that it can 
have internal constraints between its subsystems. The release 
of internal constraints can lead to measurable changes, which 
can be predicted by statistical mechanics and thermodynam- 
ics. 

Although I don’t expect serious disagreement on this prin- 
ciple, it does lead to a different emphasis than the usual 
textbook discussion. It is common to define thermodynamic 
functions for isolated systems and only much later consider 
equilibrium in composite systems. I believe that because of 
the crucial importance of composite systems, they should 
play a leading role in the development of statistical mechan- 
ics and thermodynamics. 

Section V will discuss the measurement of extensive pa- 
rameters, which are quantities that are proportional to how 
much of something there is in a system. Examples include 
the energy and the number of particles. The prediction of the 
measured values of extensive parameters is a key step in 
linking statistical mechanics to thermodynamics. 

V. PREDICTIONS OF THERMODYNAMIC 
QUANTITIES 

Principle 4: The values of extensive parameters that maxi- 
mize the probability predict the results of measurements of 
those parameters for composite systems in equilibrium. 

This principle provides the key link between statistical 
mechanics and thermodynamic measurements. 

When a constraint in a composite system is released, mea- 
surable quantities can change. As an example, consider a 
composite system consisting of two subvolumes separated by 
a partition, each containing some amount of the same type of 
ideal gas. Each subvolume contains on the order of $10^{20}$ 
particles, and our measurement apparatus can resolve the 
density of the gas to a relative accuracy of about $10^{-5}$. If a hole is 
punched in the partition, the density of the gas in each subvolume 
will go to approximately the same value, within relative fluctuations 
of the order of $10^{-10}$. Because the fluctuations are 
much smaller than the resolution of our measurement appa- 
ratus, we can take the location of the maximum of the prob- 
ability distribution to predict the experimental outcome. This 
feature strongly supports Principle 2; probability theory is 
sufficient to predict macroscopic behavior. 
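
The statement that the maximum of the probability distribution predicts the measured densities can be checked numerically for a small system. The sketch below is illustrative only. It assumes independent ideal-gas particles, so that the number of particles in the first subvolume is binomially distributed. The function names, the particle number, and the volumes are arbitrary choices that do not come from the text.

```python
import math


def log_prob(n1: int, n_total: int, v1: float, v2: float) -> float:
    """Log of the binomial probability of finding n1 of n_total particles in v1."""
    p = v1 / (v1 + v2)  # probability that a single particle is in v1
    log_binom = (math.lgamma(n_total + 1)
                 - math.lgamma(n1 + 1)
                 - math.lgamma(n_total - n1 + 1))
    return log_binom + n1 * math.log(p) + (n_total - n1) * math.log(1.0 - p)


def most_probable_n1(n_total: int, v1: float, v2: float) -> int:
    """Occupation of v1 that maximizes the probability (brute-force search)."""
    return max(range(n_total + 1),
               key=lambda n1: log_prob(n1, n_total, v1, v2))


n_total, v1, v2 = 10_000, 1.0, 3.0
n1 = most_probable_n1(n_total, v1, v2)
n2 = n_total - n1
print(n1, n1 / v1, n2 / v2)  # the two densities coincide at the maximum
```

With 10 000 particles and the second subvolume three times the size of the first, the most probable occupation of the first subvolume is 2500, which gives the same density on both sides of the partition.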

Similar examples can be given for releasing constraints on 
the energy (using walls that conduct heat) or volume (using a 
freely moving piston to separate the subvolumes). In each 
case the probability distribution is very narrow, so that the 
fluctuations cannot be observed by macroscopic measure- 
ments. The extremely small relative fluctuations of macro- 
scopic observables are so universal that, in the 19th century, 
many of Boltzmann’s opponents didn’t believe in their exis- 
tence. 

Although nonequilibrium behavior after the release of 
constraints is both interesting and important, the discussion 
here is limited to equilibrium states, which are discussed in 
Sec. VI. 


VI. EQUILIBRIUM 

Principle 5: A macroscopic equilibrium state is defined by 
two properties: the probability of macroscopically observ- 
able changes is extremely small, and there is no macroscopi- 
cally observable flux of energy or particles. (This property 
distinguishes equilibrium from steady state.) 

There might be some disagreement on this point. There is 
a substantial literature in statistical mechanics that makes the 
fundamental assertion that equilibrium is defined by a par- 
ticular “equilibrium probability distribution” in phase space 
(or Hilbert space). 

In my opinion, such a view is a serious error, primarily 
because the probability distribution of the microscopic states 
is not macroscopically observable. We use probability theory 
because we cannot discern microscopic states; we certainly 
cannot measure the relative frequency with which they occur. 

If we limit the definition of equilibrium to behavior that 
can be observed, it follows that there are many probability 
distributions that all make the same predictions. 10 The sim- 
plest probability distribution is the uniform distribution, but 
it is not unique. 

It is traditional to define a number of thermodynamic func- 
tions to facilitate the analysis of macroscopic systems in 
equilibrium. Although Principle 2 implies that these func- 
tions are not absolutely necessary, they are such convenient 
descriptions of macroscopic behavior that it would be unrea- 
sonable to do without them. Their general nature is discussed 
in Sec. VII. 


VII. THERMODYNAMIC PREDICTIONS 

Principle 6: The predictions of statistical mechanics and 
thermodynamics are representations or descriptions of a sys- 
tem based on the extent of our knowledge. 

This principle again reflects the distinction between reality 
and our knowledge of reality, between properties of a system 
and a description or representation of measurable quantities 
based on our limited knowledge. 

As an example of this distinction, consider again a com- 
posite system consisting of a box containing a gas, with a 
partition dividing the box into two equal subvolumes. The 
partition has a small hole in it, so that molecules of the gas 
can move between the two subvolumes. At any instant of 
time, there is some specific number of particles on each side 
of the partition. Thermodynamics predicts a number of par- 
ticles that give the same density on both sides of the parti- 
tion. The predicted number turns out to agree with experi- 
ment to within the limited resolution of macroscopic 
measurements. For this reason, thermodynamics provides a 
very useful description of the behavior of a macroscopic sys- 
tem. 

In contrast, the actual number of particles on each side of 
the partition at any instant cannot be the number that is pre- 
dicted. The actual number is not determined for quantum 
systems without measurement, and even for classical sys- 
tems, it fluctuates with time. The predicted number is a de- 
scription based on our knowledge and is constant in time. It 
is very useful for human purposes, but it is not a real prop- 
erty of the system. 

It is sometimes claimed that the predicted number of mol- 
ecules in each subvolume is a real property of the system if 
we regard it as an average over the course of an experiment. 
How long would the observation time have to be for such a 
claim to be true? Consider an open system with about $10^{20}$ 
particles in equilibrium and a corresponding statistical uncertainty 
of about $10^{10}$ particles. To reduce the statistical uncertainty 
of the mean to about one particle, we would need at 
least $10^{20}$ independent observations. If the correlation time 
for the system is about 1 ms, this would take $10^{17}$ s, which is 
comparable to the age of the universe. Even with such a long 
observation time, we would still not have an exact result 
because the average number of particles is generally not an 
integer. For any reasonable experiment during the lifetime of 
a physicist, the prediction of thermodynamics is in error by 
an enormous number of particles and should not be confused 
with the actual number of particles. 
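
The arithmetic behind this estimate can be reproduced in a few lines. The sketch assumes that the uncertainty of the mean falls as the single-observation uncertainty divided by the square root of the number of independent observations. The age of the universe (about 13.8 billion years) is an approximate outside value used only for comparison, and the variable names are illustrative.

```python
import math

n_particles = 1e20                 # particles in the open system
sigma = math.sqrt(n_particles)     # statistical uncertainty, about 1e10 particles
target_uncertainty = 1.0           # desired uncertainty of the mean, in particles

# Uncertainty of the mean = sigma / sqrt(n_obs), so n_obs = (sigma / target)^2.
n_observations = (sigma / target_uncertainty) ** 2

correlation_time_s = 1e-3          # one independent observation per millisecond
total_time_s = n_observations * correlation_time_s

# Approximate age of the universe in seconds (assumed value: 13.8e9 years).
age_of_universe_s = 13.8e9 * 3.156e7

print(f"observations needed: {n_observations:.1e}")
print(f"observation time:    {total_time_s:.1e} s")
print(f"fraction of the age of the universe: {total_time_s / age_of_universe_s:.2f}")
```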

For the same reasons, the energy, the entropy, and the 
associated free energies are thermodynamic descriptions 
rather than real properties of a macroscopic system. The en- 
tropy is actually defined at a higher level of abstraction than 
the energy or the number of particles. That is the subject of 
Sec. VIII. 

The distinction between real properties of a system and 
our knowledge of the system might seem philosophical and a 
bit pedantic, but it greatly clarifies some issues that might 
otherwise be rather puzzling. 


VIII. ENTROPY 

This section considers the controversial question of what 
“entropy” means and how to define it. Principle 7 is based on 
the most important of the thermodynamic properties of the 
entropy, which leads to both the thermodynamic conditions 
for equilibrium and the second law of thermodynamics. 

Principle 7: The primary property of the entropy is that it 
is maximized in equilibrium. 

Because the macroscopically observable behavior of an 
isolated system in equilibrium does not change with time, the 
maximization of the entropy cannot be applied to a simple 
system. It can be applied to a composite system: simply re- 
lease a constraint and see what happens. If the definition of 
the entropy is correct, the location of the maximum of the 
entropy should predict the observed equilibrium values of 
extensive macroscopic observables. 

Principle 7 also leads directly to the second law of ther- 
modynamics. If the entropy is always maximized in equilib- 
rium for a composite system, then the change in entropy after 
a constraint is released cannot be negative. 

If we compare Principle 7 with the predictions of probabil- 
ity theory, we see that the location of the maximum of the 
entropy must always coincide with the location of the maxi- 
mum of the probability distribution. 

An immediate consequence of Boltzmann’s 1877 defini- 
tion of the entropy as the logarithm of the probability distri- 
bution for macroscopic observables is that the location of the 
maximum of the entropy always coincides with the equilib- 
rium values of those macroscopic observables. If any other 
definition is used, it requires a separate demonstration to 
show that it also predicts these values correctly. 
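
As a worked illustration of this coincidence, consider the simplest case: $N$ distinguishable ideal-gas particles distributed between two subvolumes $V_1$ and $V_2$, with $V = V_1 + V_2$. Only the dependence on particle number is kept, and Stirling's approximation is used for the derivative of $\ln N_j!$. The choice of example and the notation are assumptions made for this sketch. It is not a reproduction of the full argument.

```latex
\begin{align}
P(N_1) &= \frac{N!}{N_1!\,N_2!}
          \left(\frac{V_1}{V}\right)^{N_1}
          \left(\frac{V_2}{V}\right)^{N_2},
          \qquad N_2 = N - N_1, \\
\ln P(N_1) &= \ln\frac{V_1^{N_1}}{N_1!}
            + \ln\frac{V_2^{N_2}}{N_2!}
            - \ln\frac{V^{N}}{N!}, \\
\frac{\partial \ln P}{\partial N_1}
  &\approx \ln\frac{V_1}{N_1} - \ln\frac{V_2}{N_2} = 0
  \quad\Longrightarrow\quad
  \frac{N_1}{V_1} = \frac{N_2}{V_2}.
\end{align}
```

If the entropy of the composite system is identified with $k_B \ln P$ up to an additive constant, its maximum lies at equal densities, which is the observed equilibrium value. The logarithm also separates into one term for each subsystem plus a term that does not depend on $N_1$.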

The automatic agreement of the predictions of Boltz- 
mann’s definition of the entropy with the correct equilibrium 
values of macroscopic observables makes it the natural 
choice. It might be possible to define the entropy differently, 
but the predictions of any alternative definition must be iden- 
tical to those of Boltzmann’s definition in terms of the loga- 
rithm of the probability. 

Principle 7 completes what I regard as a convincing argu- 
ment in favor of Boltzmann’s 1877 definition of the entropy. 

The remainder of the paper takes up issues that are asso- 
ciated with the concept of entropy. Their purpose is both to 
introduce the remaining principles that have guided my 
thinking on these issues and to complete the picture pre- 
sented so far. 


IX. ADDITIVITY 

Principle 8: Additivity is essential to any consistent defi- 
nition of the entropy of a system with short-ranged interac- 
tions between its particles. 

In thermodynamics it is generally assumed that the en- 
tropy of a composite system is given by the sum of the 
entropies of the subsystems. This property is known as “ad- 
ditivity.” 

For Boltzmann’s 1877 definition of the entropy, the valid- 
ity of the assumption of additivity is based on the short range 
of molecular interactions, which is much smaller than the 
dimensions of the system. Only a very small fraction of the 
particles in one subsystem interact with those in another sub- 
system, so that the sum of all such interaction energies is still 
relatively small. If the direct interactions between sub- 
systems can be neglected, the entropy satisfies additivity. 

As an aside, using Boltzmann’s definition of the entropy 
suggests the alternative of referring to this property as “sepa- 
rability,” because the entropy of a composite system is de- 
fined first. 


If we were to use a definition of the entropy that did not 
satisfy additivity and nevertheless wanted to have correct 
results for composite systems, we could assign an arbitrary 
function — or simply the value zero — as the entropy of any 
subsystem. The entropy of a composite system could then be 
obtained by adding an extra term to recover the Boltzmann 
expression. It is possible to create such a formalism, but 
none of the usual expressions for temperature, pressure, or 
chemical potential in terms of partial derivatives of the en- 
tropy would be necessarily valid. Without additivity, we 
would not have thermodynamics as we know it. 

The importance of additivity probably would go without 
saying if it were not for a suggestion that an otherwise in- 
correct definition of the entropy might be saved by an extra 
term for composite systems. 8 I don’t see any virtue to such a 
procedure, and I stand by Principle 8. 

X. THE THERMODYNAMIC LIMIT 

The thermodynamic limit is defined as the infinite-size 
limit of the ratios of extensive quantities — ratios such as the 
energy per particle $U/N$ or the particle density $N/V$. The 
advantage of taking the limit of infinite size is that uncertainties 
in these ratios go to zero because the relative fluctuations 
are generally proportional to $1/\sqrt{N}$. 

Principle 9: The thermodynamic limit is not required for 
the validity of thermodynamics. 

To judge from some textbooks, this principle might be the 
most controversial of the ones discussed in this paper. 

However, the thermodynamic limit is misnamed. It is not 
essential to the foundations of thermodynamics. It cannot be 
essential if we are to apply thermodynamics to real systems, 
which are necessarily finite. We never do experiments on 
infinite systems. If thermodynamics worked only for infinite 
systems, it might still be interesting as mathematics, but it 
would be irrelevant as science. 

The thermodynamic limit is mathematically convenient 
for certain problems. Phase transitions, for example, only 
exhibit nonanalytic behavior in the thermodynamic limit, 
which makes for a much cleaner mathematical description. 
Nevertheless, the thermodynamic limit should not play any 
essential role in the foundations of statistical mechanics and 
thermodynamics. 

XI. DISTINGUISHABILITY AND 
INDISTINGUISHABILITY 

Principle 10: “Indistinguishability” is a property of micro- 
scopic states. It does not depend on experimental resolution. 

In my opinion, this principle should be an obvious conse- 
quence of the definitions found in any textbook on quantum 
mechanics. However, I have had enough arguments about it 
to know that it is far from obvious. 

The definitions of distinguishability and indistinguishabil- 
ity are simple: (1) If the exchange of two particles in a sys- 
tem results in a different microscopic state, the particles are 
distinguishable. (2) If the exchange of two particles in a sys- 
tem results in the original microscopic state, the particles are 
indistinguishable. (For fermions, two states are usually re- 
garded as identical if they differ only by an overall minus 
sign.) 

The definition of indistinguishability does not have anything 
to do with the interactions between particles. It is possible 
in either quantum or classical physics for two distinct 
states to have the same energy. Nevertheless, if the microscopic 
state is different after the exchange of two particles, 
those particles are distinguishable. 

Unfortunately, “distinguishable” is sometimes confused 
with what might be called “observably different.” Two par- 
ticles are observably different if exchanging them alters the 
properties of the system in a way that is observable. Clearly, 
if particles are observably different, they must be distin- 
guishable. In contrast, particles can be distinguishable with- 
out their exchange producing any observable differences. 

A simple example of this distinction is provided by a mixture 
of $^3$He and $^4$He. It would not be possible for a macroscopic 
measurement to detect the difference in the microscopic 
states that would result from exchanging a single $^3$He 
atom with a single $^4$He atom. Nevertheless, there would be a 
difference in the microscopic states, and the two isotopes of 
helium are not mutually indistinguishable. 

The term “identical particles” is often used as a synonym 
for indistinguishable particles. This use has the unfortunate 
consequence that a model of classical distinguishable par- 
ticles with identical properties might be mistaken for a model 
of indistinguishable particles. 

The concept of indistinguishability is foreign to classical 
mechanics. Consider the trajectory of an isolated classical 
system in phase space (the $6N$-dimensional space defined by 
the positions and momenta of all particles in the system) in 
which the microscopic state is described by a point. If two 
particles are exchanged at a given time, the trajectory be- 
comes discontinuous. The exchange of particles has resulted 
in a different microscopic state, regardless of whether the 
Hamiltonian gives the same energy for the two microstates. 

In quantum mechanics, $N$-particle states of indistinguishable 
particles are characterized by a wave function that has 
been symmetrized (or antisymmetrized) by summing over all 
permutations of the particles, with a change in sign for each 
permutation for fermions, or without a change in sign for 
bosons. 

A classical system of indistinguishable particles can be 
described by the same procedure. The microscopic state of a 
classical system of indistinguishable particles would be described 
by the $N!$ points in phase space found from the set of 
all permutations of the particles. The trajectory (or trajectories) 
of the set of $N!$ points is clearly unaffected by the exchange 
of any two particles at any point in time. 
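
The difference between the two descriptions can be illustrated with a toy representation. In the sketch below, a microscopic state of distinguishable particles is a tuple of single-particle phase-space points ordered by particle label, and the corresponding state of indistinguishable particles is the set of all permutations of that tuple. The data layout and the function names are assumptions made for this example.

```python
from itertools import permutations


def symmetrized_state(particles):
    """The set of all N! orderings of the single-particle phase-space points."""
    return frozenset(permutations(particles))


def exchange(particles, i, j):
    """Return the labelled state with particles i and j exchanged."""
    swapped = list(particles)
    swapped[i], swapped[j] = swapped[j], swapped[i]
    return tuple(swapped)


# Three particles in one dimension, each given as (position, momentum).
state = ((0.0, 1.0), (0.5, -2.0), (1.5, 0.3))
swapped = exchange(state, 0, 2)

# Distinguishable particles: the exchange gives a different point in phase space.
print(state != swapped)

# Indistinguishable particles: the set of N! points is unchanged by the exchange.
print(symmetrized_state(state) == symmetrized_state(swapped))
print(len(symmetrized_state(state)))  # N! points for N = 3
```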

The idea of representing a classical state by $N!$ points in 
phase space is a bit odd, but that is because indistinguish- 
ability is not a classical concept. However, if indistinguish- 
ability is to be imposed on a classical system, this represen- 
tation seems to be the most reasonable way of doing it. 

Many textbooks claim that classical systems with distin- 
guishable and indistinguishable particles are described by 
different expressions for the entropy. However, it is straight- 
forward to demonstrate that the macroscopic properties of a 
classical system are exactly the same whether the particles 
are distinguishable or indistinguishable. 2 Since the macro- 
scopic behaviors of classical systems with distinguishable 
and indistinguishable particles are the same, it seems natural 
that their entropies should also be the same, which leads to 
my next principle. 

Principle 11: Systems with identical macroscopic proper- 
ties should be described by the same entropy. 

Boltzmann’s 1877 definition of the entropy gives the same 
expression for the entropy for classical systems with either 
distinguishable or indistinguishable particles. 2 The traditional 
definition in terms of a volume in phase space, which 
is often erroneously attributed to Boltzmann, 5 gives different 
expressions, at least one of which must clearly be incorrect. 
The worst failings of the traditional definition of the entropy 
for a system of distinguishable particles are that it violates 
the second law of thermodynamics and makes incorrect predictions 
for equilibrium with respect to the exchange of particles 
between subsystems. 

The error in the traditional definition of the entropy of a 
classical system of distinguishable particles also has the con- 
sequence that it predicts that the entropy of an ideal gas is 
not extensive. This problem is not really fundamental, but it 
has bothered people. And it leads to the next principle. 

XII. EXTENSIVITY 

Principle 12: Extensivity is not essential to thermodynam- 
ics. 

Extensivity is the property that the macroscopic observ- 
ables of a system are all directly proportional to its size. This 
property implies that ratios, such as $U/N$, $V/N$, and $S/N$, are 
all independent of the size of the system. In many textbooks, 
extensivity is taken to be a fundamental postulate of 
thermodynamics. 12 It is certainly convenient mathematically, 
because it leads directly to the Euler and Gibbs-Duhem 
equations. It is an appropriate assumption when the physical 
properties of a material are being investigated, and the sur- 
face or interface contributions can be neglected. 
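
For reference, the standard route from extensivity to these two equations can be sketched as follows for a single-component system. The notation, with temperature $T$, pressure $P$, chemical potential $\mu$, and scale factor $\lambda$, is assumed here and is not defined in the text.

```latex
\begin{align}
S(\lambda U, \lambda V, \lambda N) &= \lambda\, S(U, V, N)
  && \text{(extensivity)} \\
S &= \frac{U}{T} + \frac{P V}{T} - \frac{\mu N}{T}
  && \text{(differentiate with respect to } \lambda \text{ at } \lambda = 1) \\
U &= T S - P V + \mu N
  && \text{(Euler equation)} \\
0 &= S\,dT - V\,dP + N\,d\mu
  && \text{(Gibbs-Duhem equation)}
\end{align}
```

The second line uses $\partial S/\partial U = 1/T$, $\partial S/\partial V = P/T$, and $\partial S/\partial N = -\mu/T$. The last line follows by taking the differential of the Euler equation and subtracting $dU = T\,dS - P\,dV + \mu\,dN$.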

However, real systems have surfaces and interfaces, which 
are important topics of research. Because the surface-to- 
volume ratio changes with the size of the system, real sys- 
tems are not extensive, and the deviations from extensivity 
can be very important. For example, a real gas in a real 
container will usually be adsorbed to some extent on the 
inner walls of the container. At low temperatures, the fraction 
of adsorbed molecules can be quite large, which is exploited 
in the construction of cryopumps. 

To describe the thermodynamics of a surface, we must be 
able to describe the thermodynamics of a nonextensive sys- 
tem and extract the parts of the free energy, etc. that are not 
directly proportional to the size. Therefore, statistical me- 
chanics and thermodynamics must be applicable to nonex- 
tensive systems. 

Recognizing that extensivity is not an essential property of 
thermodynamic systems is important in deciding on an ap- 
propriate definition of entropy. Some colleagues claim that a 
definition of entropy that gives a demonstrably incorrect ex- 
pression can be made acceptable by imposing extensivity 
with an additional term of the form $-k_B \ln(N!)$. However, 
because thermodynamics should also correctly describe non- 
extensive systems, that is, systems with entropies that cannot 
be made extensive by a term that depends only on N, such a 
correction is not feasible. 

There is also another difficulty in trying to impose exten- 
sivity on the fundamental definition of the entropy. If the 
system under consideration contains more than one kind of 
particle, the criterion of extensivity is ambiguous. For ex- 
ample, suppose we have a gas mixture of distinguishable 
particles, with $N_A$ particles of type A and $N_B$ particles of type
B. The common textbook definition of the entropy as the 
logarithm of a volume in phase space gives an answer that is 
not extensive (and incorrect for other reasons). We might try
to impose extensivity with the addition of either 
$-k_B \ln(N_A!\,N_B!)$ or $-k_B \ln[(N_A+N_B)!]$. The first choice is the
one we want, of course, but the criterion of extensivity does 
not eliminate the second. If this path were to be taken, at 
least one more principle would have to be invoked to obtain 
an unambiguous definition. 
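
The ambiguity can be checked numerically. The following is a minimal sketch, not the author's calculation. It assumes that the phase-space-volume entropy of the mixture has the form N k_B [(3/2) ln(U/N) + ln V] with N = N_A + N_B (equal masses, additive constant dropped), sets k_B = 1, and uses the leading-order Stirling approximation ln n! ≈ n ln n − n. All function names are hypothetical.

```python
import math


def stirling_ln_factorial(n):
    """Leading-order Stirling approximation: ln n! ~ n ln n - n (assumed)."""
    return n * math.log(n) - n


def s_omega(u, v, n_a, n_b):
    """Assumed phase-space-volume entropy of the mixture (k_B = 1, constant dropped)."""
    n = n_a + n_b
    return n * (1.5 * math.log(u / n) + math.log(v))


def s_corrected_separate(u, v, n_a, n_b):
    """Correction -ln(N_A! N_B!): one factorial per particle type."""
    return (s_omega(u, v, n_a, n_b)
            - stirling_ln_factorial(n_a)
            - stirling_ln_factorial(n_b))


def s_corrected_total(u, v, n_a, n_b):
    """Correction -ln[(N_A + N_B)!]: a single factorial for all particles."""
    return s_omega(u, v, n_a, n_b) - stirling_ln_factorial(n_a + n_b)


def is_extensive(entropy, u, v, n_a, n_b, scale=2.0):
    """True if scaling U, V, N_A, N_B by `scale` scales the entropy by `scale`."""
    scaled = entropy(scale * u, scale * v, scale * n_a, scale * n_b)
    return math.isclose(scaled, scale * entropy(u, v, n_a, n_b), rel_tol=1e-9)


if __name__ == "__main__":
    args = (1.5e6, 2.0e6, 3.0e5, 7.0e5)  # arbitrary illustrative values
    for f in (s_omega, s_corrected_separate, s_corrected_total):
        print(f.__name__, is_extensive(f, *args))
```

Under these assumptions both corrected expressions pass the extensivity test while the uncorrected one fails. The two corrected entropies differ by -[N_A ln(N_A/N) + N_B ln(N_B/N)], which has the form of the entropy of mixing, so extensivity alone cannot choose between them.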

Although extensivity is a useful assumption when analyz- 
ing the properties of a material, rather than a system with 
surfaces, it is not essential to either thermodynamics or sta- 
tistical mechanics, and should not be included as part of the 
definition of entropy. 


XIII. CONSEQUENCES OF THE 12 PRINCIPLES 

The principles I have given have led me to the conclusion 
that Boltzmann’s 1877 definition of the entropy as the loga- 
rithm of the probability of macroscopic states for composite 
systems is superior to any other proposed definition. In par- 
ticular, it is superior to a definition in terms of a volume in 
phase space that is often found in textbooks for classical 
statistical mechanics. 

If the principles I have presented in this paper are correct, 
any other valid definition of entropy must turn out to be 
equivalent to defining entropy in terms of probability. 


XIV. GIBBS’ PARADOX 

Alternatives to Boltzmann’s 1877 definition of the entropy 
have led to problems that have been debated for over a hun- 
dred years. The debate has centered on Gibbs’ paradox, 
which refers to a set of old problems in statistical 
mechanics. 13 The two main problems concern the properties 
of the entropy of systems of distinguishable particles. In my 
opinion, they are both easy to resolve on the basis of the 
principles I have given. 


A. Extensivity 


The first version of Gibbs’ paradox concerns the properties 
of the entropy as defined in terms of the logarithm of a vol- 
ume in phase space. Boltzmann’s 1877 definition in terms of 
the logarithm of the probability of a composite system does 
not have this problem. 

If U is the energy, V is the volume, N is the number of
particles, and m is the mass of a particle, the volume in phase
space (often denoted by Ω) consists of all points for which N
particles are in a container of volume V with a total energy
less than or equal to U. For an ideal gas, this volume is given by

$$\Omega = V^N \frac{\pi^{3N/2}}{\Gamma(3N/2+1)} (2mU)^{3N/2}. \tag{1}$$


If the entropy is defined in terms of the logarithm of this 
volume in phase space, 

$$S_\Omega = k_B \ln \Omega, \tag{2}$$


Stirling’s approximation gives an expression for the entropy 
of the form 


$$S_\Omega(U,V,N) = N k_B \left[ \frac{3}{2} \ln\left(\frac{U}{N}\right) + \ln V + \ln X \right], \tag{3}$$


where X is a constant that can be calculated from Eq. (1). 
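
The step from the phase-space volume to the Stirling form can be sketched as follows. This is an illustration under stated assumptions, not a derivation taken from the text: the particles are assumed to have mass m, the ideal-gas phase-space volume is taken to be $V^N \pi^{3N/2} (2mU)^{3N/2} / \Gamma(3N/2+1)$, only the leading-order Stirling approximation $\ln\Gamma(n+1) \approx n\ln n - n$ is kept, and no factors of Planck's constant are included.

```latex
\begin{align*}
\ln\Omega
  &= N\ln V + \frac{3N}{2}\ln(2\pi m U) - \ln\Gamma\!\left(\frac{3N}{2}+1\right) \\
  &\approx N\ln V + \frac{3N}{2}\ln(2\pi m U)
     - \frac{3N}{2}\ln\!\left(\frac{3N}{2}\right) + \frac{3N}{2} \\
  &= N\left[\frac{3}{2}\ln\!\left(\frac{U}{N}\right) + \ln V
     + \frac{3}{2}\ln\!\left(\frac{4\pi m e}{3}\right)\right].
\end{align*}
```

With these assumptions the constant would be $X = (4\pi m e/3)^{3/2}$. The term ln V, rather than ln(V/N), is what makes the expression nonextensive.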

This expression for the entropy, Eq. (3), is not extensive.
As explained in Sec. XII, I do not regard the lack of extensivity
as a problem in itself. However, Eq. (3) leads to a
violation of the second law of thermodynamics. 6 That is a
problem!

Consider an ideal gas of N particles in a volume V, and 
assume that the entropy before inserting the partition is given 
by Eq. (3). Now insert a partition that divides the system into 
two equal volumes. The total entropy after inserting the par- 
tition is given by twice the entropy of a system half the size 
of the original one, 


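$$2 S_\Omega(U/2, V/2, N/2) = N k_B \left[ \frac{3}{2} \ln\left(\frac{U}{N}\right) + \ln\left(\frac{V}{2}\right) + \ln X \right]. \tag{4}$$

The change in $S_\Omega$ is

$$\Delta S_\Omega = 2 S_\Omega(U/2, V/2, N/2) - S_\Omega(U, V, N) = -N k_B \ln 2. \tag{5}$$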
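
The arithmetic behind Eq. (5) can be verified with a short numerical sketch. It assumes the Stirling form of Eq. (3), N k_B [(3/2) ln(U/N) + ln V + ln X], with k_B = 1 by default and an arbitrary value for ln X. The function names and the sample values are hypothetical.

```python
import math


def s_omega(u, v, n, ln_x=0.0, k_b=1.0):
    """Entropy from the log of the phase-space volume, Stirling form (assumed)."""
    return n * k_b * (1.5 * math.log(u / n) + math.log(v) + ln_x)


def partition_entropy_change(u, v, n, ln_x=0.0, k_b=1.0):
    """Entropy after inserting a partition (two half-systems) minus entropy before."""
    before = s_omega(u, v, n, ln_x, k_b)
    after = 2.0 * s_omega(u / 2.0, v / 2.0, n / 2.0, ln_x, k_b)
    return after - before


if __name__ == "__main__":
    n = 1000.0
    delta = partition_entropy_change(u=1.5e3, v=2.0, n=n, ln_x=0.7)
    print(delta, -n * math.log(2.0))  # the two numbers should agree
```

The result does not depend on U, V, or the constant X: only the ln V term changes when the system is halved, which is why the decrease is exactly N k_B ln 2.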

The decrease in entropy predicted by the entropy in Eq. (3) 
violates the second law of thermodynamics as expressed in 
the Clausius inequality, 14

$$\Delta S \ge \int_i^f \frac{dQ}{T}, \tag{6}$$

where i and f refer to the initial and final macroscopic states,
before and after inserting the partition. Because dQ=0 while 
the partition is being inserted, the Clausius inequality is vio- 
lated by Eq. (5). This violation eliminates a definition of the 
entropy in terms of the logarithm of a volume in phase space 
from consideration as the entropy of a classical gas. 6 

Boltzmann’s definition of entropy in terms of the loga- 
rithm of the probability gives exactly the same result for 
classical particles whether they are distinguishable or not,

$$S_B(U,V,N) = N k_B \left[ \frac{3}{2} \ln\left(\frac{U}{N}\right) + \ln\left(\frac{V}{N}\right) + \ln X' \right], \tag{7}$$

where $X'$ is a constant. Because Eq. (7) for the entropy does not violate the second
law of thermodynamics, there is no paradox and no problem. 
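
The same partition test can be applied to Boltzmann's entropy. This sketch assumes that Eq. (7) has the standard extensive ideal-gas form N k_B [(3/2) ln(U/N) + ln(V/N) + ln X'], with k_B = 1 by default and an arbitrary constant. The function names are hypothetical.

```python
import math


def s_boltzmann(u, v, n, ln_x=0.0, k_b=1.0):
    """Assumed extensive ideal-gas entropy: depends on V only through V/N."""
    return n * k_b * (1.5 * math.log(u / n) + math.log(v / n) + ln_x)


def partition_entropy_change(u, v, n, ln_x=0.0, k_b=1.0):
    """Entropy of two half-systems minus the entropy of the undivided system."""
    before = s_boltzmann(u, v, n, ln_x, k_b)
    after = 2.0 * s_boltzmann(u / 2.0, v / 2.0, n / 2.0, ln_x, k_b)
    return after - before


if __name__ == "__main__":
    # Inserting the partition leaves U/N and V/N unchanged, so the change vanishes.
    print(partition_entropy_change(u=1.5e3, v=2.0, n=1000.0, ln_x=0.7))
```

Because the ratios U/N and V/N are the same before and after the partition is inserted, the computed change is zero, consistent with the Clausius inequality when dQ = 0.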


B. Continuity 


Another problem, which is also known as Gibbs’ paradox, 
concerns the desire for continuity as the interactions between 
particles in a system go continuously from being measurably 
different to being the same for all particles. 

For example, consider a classical ideal gas with $N_A$ particles
of type A and $N_B$ particles of type B. All particles of a
given type have the same properties, but these properties are
different for type A and type B particles. The entropy of this
system differs from the entropy of an ideal gas of $N = N_A + N_B$
particles of a single kind by the amount

$$\Delta S = -k_B \left[ N_A \ln\left(\frac{N_A}{N}\right) + N_B \ln\left(\frac{N_B}{N}\right) \right] > 0. \tag{8}$$


Equation (8) is the well-known entropy of mixing. 
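
As a small numerical illustration of the entropy of mixing, the sketch below evaluates -k_B [N_A ln(N_A/N) + N_B ln(N_B/N)] with N = N_A + N_B. The function name is hypothetical, k_B = 1 by default, and both particle numbers are assumed to be positive.

```python
import math


def mixing_entropy(n_a, n_b, k_b=1.0):
    """Entropy of mixing for n_a > 0 particles of type A and n_b > 0 of type B."""
    n = n_a + n_b
    return -k_b * (n_a * math.log(n_a / n) + n_b * math.log(n_b / n))


if __name__ == "__main__":
    # Equal amounts of A and B: the result should equal N ln 2 (in units of k_B).
    print(mixing_entropy(500.0, 500.0), 1000.0 * math.log(2.0))
    # Unequal amounts still give a positive value.
    print(mixing_entropy(300.0, 700.0))
```

For equal amounts of the two types the expression reduces to N k_B ln 2. It is positive for any positive N_A and N_B because both logarithms are negative.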

The concern is that as the differences in the properties of 
the two types of particles vanish, the entropy of the system 
changes discontinuously by the entropy of mixing given in 
Eq. (8). 

First of all, it is quite possible for the interactions between 
particles to be essentially identical, but to still be able to 
separate them in some way — using differences in diffusion 
rates for different isotopes, for example. In that case, the
entropy would not change discontinuously as the differences 
in the interactions vanish. 

However, suppose all differences in interactions, masses, 
etc. could be made to go continuously to zero. At some point, 
the differences would become smaller than the resolution of 
our experiments. Nevertheless, at any level of difference in 
the interactions, we either would or would not be able to 
measure the difference. 

If the entropy were a property of the system (reality) — 
instead of a description of the system (representation of our 
knowledge), as argued in Sec. VII — a discontinuity of the 
entropy would be strange. However, the entropy is given by 
the probability, which is, in turn, related to our knowledge of 
the system. There is no problem with our description (or 
knowledge) of a system changing discontinuously when our 
information changes discontinuously. If we cannot determine 
experimentally that there are two different types of particles, 
then a description that lumps them together will still be cor- 
rect. Common practice lumps the various isotopes of an ele- 
ment together for most thermodynamic applications. Al- 
though different isotopes are clearly distinguishable, the 
macroscopic predictions are not affected. 

The problem of continuity is often expressed in terms of a 
continuous change from distinguishable to indistinguishable 
particles. However, such a change is intrinsically discontinu- 
ous and does not occur simply because the interactions be- 
tween the particles become identical. 

XV. SUMMARY 

I have put forward 12 principles that have led me to con- 
clude that Boltzmann’s 1877 definition of the entropy in 
terms of the logarithm of the probability of macroscopic 
states of composite systems is superior to all other options. 

It would be too much to hope that my arguments will find 
universal agreement. However, I hope that further discus- 
sions will be clarified by an improved understanding of one 
point of view. Those who might have different points of view 
have the opportunity to express which of the principles they 
object to and present their own alternatives. 

The issues I have discussed have been the subject of dis- 
agreements for well over a century. It might be that, in the
end, the conclusions of the scientific community deviate
from the principles I have listed here. However, the purpose 
of this paper will be fulfilled if it paves the way to a final 
consensus. 